Multifractal Analysis of Polyalanines Time Series 

Multifractal properties of the energy time series of short α-helix
structures, specifically from a polyalanine family, are investigated
through the MF-DFA technique (multifractal detrended fluctuation
analysis). Estimates for the generalized Hurst exponent h(q) and its
associated multifractal exponents τ(q) are obtained for several series
generated by numerical simulations of molecular dynamics in different
systems from distinct initial conformations. All simulations were
performed using the GROMOS force field, implemented in the program THOR.
The main results show that all series exhibit multifractal behavior that
depends on the number of residues and the temperature. Moreover, the
multifractal spectra reveal important aspects of the time evolution of
the system and suggest that the nucleation process of the secondary
structures during the visits on the energy hyper-surface is an essential
feature of the folding process.

I. INTRODUCTION

Over the past few years, the statistical analysis of self- 
similar time series has become established as an impor- 
tant tool for investigating several natural phenomena. In 
general, a large part of these studies has been devoted to 
characterizing the complex statistical fluctuations shown 
by these series. Such fluctuations are associated with long-range
correlations among the dynamic variables present in these series,
which obey the behavior usually described by a fractal power-law
decay [1].

One of the difficulties encountered in these investiga- 
tions is related to the fact that the series may contain het- 
erogeneous properties imposing certain statistical trends 
over itself. In other words, these series are not stationary.
Therefore, it is necessary to employ a technique capable of
accounting for this, since these trends may influence the
correlations that exist in the series.

Two techniques have proved successful in eliminating 
these trends in time series: the wavelet transform mod- 
ulus maxima (WTMM) H, i| and the detrended fluctu- 
ation analysis (DFA) Q. Both techniques are based on 
local polynomial regression in order to eliminate local 
trends present in different segments of the series. The 
DFA technique has been particularly efficient for a large 
range of areas such as: DNA sequences @; heartbeat 
analysis Q ; economy [§| ; seismology [10]; meteorology [11];
astrophysics [12], among others.

Basically, the choice of the DFA technique for these studies stems
from its easy implementation. Moreover, it is a tool that allows the
role of trends in stationary time series to be analyzed, as well as
efficiently estimating the long-range correlations through a single
parameter: the scale exponent α. It is important to emphasize that
the type of correlation present in a stationary series depends on the
value found for the exponent α. Thus, for α = 0.5 the signal is
uncorrelated (white noise or Gaussian), while for α < 0.5 there is
anti-correlation (anti-persistence) and for α > 0.5 there is
correlation (persistence) [13].
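
The three regimes just described can be restated compactly in code. The following Python sketch maps a DFA scale exponent onto its correlation regime; the function name and the tolerance `tol` (used to decide when an estimate counts as α = 0.5) are illustrative assumptions and are not part of the DFA method itself.

```python
def correlation_regime(alpha: float, tol: float = 0.05) -> str:
    """Classify a DFA scale exponent alpha.

    tol is a hypothetical tolerance around 0.5; in practice it would be
    chosen from the uncertainty of the fitted exponent.
    """
    if abs(alpha - 0.5) <= tol:
        return "uncorrelated"
    if alpha < 0.5:
        return "anti-correlated (anti-persistent)"
    return "correlated (persistent)"


for a in (0.3, 0.5, 0.8):
    print(a, correlation_regime(a))
```

A tolerance is used because a fitted exponent is never exactly 0.5; a stricter analysis would compare the estimate against shuffled surrogates of the same series.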

Several attempts to apply the DFA technique to non-stationary time
series (series affected by local trends) have not provided
satisfactory results. Fundamentally, this has occurred because these
series are not entirely characterized by a unique scale exponent α,
since different segments possess fluctuations characterized by
distinct values of α. In this case, the correct formalism for
obtaining the distribution of scale exponents is multifractal
analysis [13].

Recently, the number of works focusing on the multifractal aspects of
non-stationary time series has grown, particularly those based on
experimental data [14]. Outstanding among these are the applications
of the generalized DFA technique, known as Multifractal Detrended
Fluctuation Analysis (MF-DFA), as proposed by Kantelhardt and
collaborators [15], to a wide range of problems such as DNA sequences
[16], meteorology [17], seismology [18] and others. It should also be
remembered that two factors favor the use of the MF-DFA technique: its
effortless implementation, and its excellent performance on both
artificial and real data when compared with the performance of the
wavelet transform approach for the same systems [13].

One highly relevant problem in molecular biology within this context
is linked to studies of the protein folding process through the
characterization of its potential energy landscape. Such landscapes
constitute a satisfactory representation of the potential energy of
interaction among the system's various microscopic degrees of
freedom [19].

In general, the adopted strategies are based on the assumption that
the energy landscapes of proteins are complex: being time dependent,
they present a rugose structure, with many maxima and minima separated
by barriers of varying heights. These properties imply complex
evolutionary dynamics, in which the system experiences a variety of
time scales [20, 21].

Previous studies by Lidar and collaborators [22] used molecular
dynamics simulations (MOIL program) and a variational method of
fractal analysis to study the fractal properties of time series of the
potential energy of molecular systems such as myoglobin and
polyalanines, among others. Basically, they investigated systems
subjected to a temperature T = 300 K and a simulation time in the
range 10 < t < 25 ps.

Their results suggest that the value of the fractal dimension (the
rugosity exponent) depends slightly on temperature and that the
presence of α-helix structures smooths the rugosity of the series.
Furthermore, there was evidence of universal behavior, i.e. the
rugosity of different systems is described by the same fractal
dimension. Recently, Hegger et al. [19] analyzed time series extracted
from molecular dynamics simulations (GROMACS program) at a temperature
of T = 300 K, for polyalanines with the number N of amino acids
ranging between 3 and 10, reaching simulation times of 100 ns.

Considering that these time series represent the dynamic trajectories
followed by the system, these authors found that the effective fractal
dimension of such trajectories decreases with the chain size.
According to them, this behavior is due to a stabilizing effect of the
hydrogen bonds on the protein secondary structure (α-helix), which
smooths the rugosity of the trajectories. Confirming whether this
scenario survives a careful fractal analysis that searches for the
fine details of the time series fluctuations has become a central
problem to be clarified.

The present work introduces an approach that combines molecular
dynamics simulations with MF-DFA to characterize the rugosity of
potential energy profiles for polyalanine molecules. By considering
these profiles as energy time series, we investigate the effects
produced on the trajectories traced over the hyper-surface of the
potential energy when the size and temperature of the system are
changed. In particular, we will show that the manner in which the
system visits the phase space in its dynamic evolution depends
significantly on both the equilibrium temperature and the nucleation
of secondary structures in polyalanines.

This article is organized as follows: Section II presents the
molecular dynamics simulations, the energy time series, and the
dependence of the energy on the temperature T and the number of amino
acids N. Section III presents the multifractal spectra, obtained from
the MF-DFA technique, associated with each polyalanine time series.
The effects on the spectra caused by changes in the size of the chain
and in the temperature, and by the presence of secondary structures,
are discussed. Finally, Section IV presents our conclusions.



II. MOLECULAR DYNAMICS AND TIME SERIES

Molecular dynamics simulations have been extensively 
used to study the problem of protein folding [23]. In
general, these simulations involve considerable compu- 
tational effort, since the integration of the equations of 
motion must be made for a system with many particles. 

In the case of molecular systems, it is known that such 
structures can take on a great number of configurations, 
which grow with the number of degrees of freedom of 
the system. Therefore, molecular dynamics calculations 
for protein systems necessarily require the definition of 
effective potentials, from which the resulting force that 
acts on each particle is determined. 

In this work, the numerical molecular dynamics calculations were
performed with the aid of an efficient computer code: the THOR program
[25], developed to investigate structures of biological interest, such
as proteins. The code includes the GROMOS force field in its
architecture, used to simulate the atomic interactions in the
molecule.

In the THOR program, the conformational energy of the molecule is made
up of a sum of bonded and non-bonded terms. In this approach, only
hydrogen atoms covalently bonded to oxygen or to nitrogen are
considered explicitly, whereas CH1, CH2, and CH3 groups are each
treated as a single atomic unit. Thus, we analyze the changes of the
following energy function:



$$E = E_h + E_\theta + E_\varphi + E_v + E_{lj} + E_{el}, \tag{1}$$

where E_h is the Hooke potential, E_θ is the angular potential, E_φ is
the improper potential, E_v is the dihedral-angle potential, E_lj is
the Lennard-Jones potential, and E_el is the Coulomb potential term
(see definitions and parameters used in [3, [HI ).

Specifically, we simulate polyalanine structures with different
numbers of residues at different equilibrium temperatures and from
different initial conformations. Polyalanines are used as prototypes
to study the folding process of structures in α-helix conformations.
In these dynamics, the electric dipole moments arising from the charge
imbalance between the NH and CO groups of the peptide bond, the
hydrogen bonds, and the van der Waals interactions are key ingredients
in the cooperative effect responsible for forming such structures, an
effect that is accelerated as the number of amino acids in the protein
increases.

Thus, as pointed out by Shoemaker and collaborators [26], Moret and
collaborators [27] and Rogers [28], a critical minimum number of amino
acids is necessary for these configurations to be observed.
Furthermore, there is an upper critical number due to destabilization
brought on by entropic effects.

For the numerical calculations, a similar protocol was adopted in all
cases. The initial temperature was T_i = 1 K, and the system was
heated at a rate of 5 K per step (ps) until it reached the desired
equilibrium temperature. Three equilibrium temperatures were
considered: T = 275 K, T = 300 K and T = 325 K, in a continuous medium
described by a relative dielectric constant ε_r = 2. The integration
time step was 5 × 10^-4 ps, and for all simulations N_step = 5 × 10^7
steps were performed, giving a total time of the order of 25 ns. In
calculating this time, the interval associated with the thermalization
of the system was discarded.

Figure 1 shows the last 5 ns of observation of the potential energy
time series associated with polyalanine structures at T = 300 K and
N = 10, 15, and 18. In all cases examined, the series showed the
typical rugosity observed in other complex phenomena described by
self-affine time series hj. It should be emphasized that, for all
temperature values, calculations were carried out for N between 8 and
18 residues, and the results display behavior similar to that
presented in Figure 1.

FIG. 1: Potential energy time series of polyalanines with different
numbers of residues: (a) N = 10 (black), (b) N = 15 (red), and
(c) N = 18 (blue). In all cases the thermal equilibrium temperature is
T = 300 K.



III. MF-DFA - MULTIFRACTAL SPECTRA

Once the time series of polyalanines have been obtained and the
presence of rugosity has been observed, a careful characterization of
the statistical fluctuations embedded in the series should be
performed in order to obtain information on the dynamic behavior of
the system. In this work, the MF-DFA method is applied, following the
steps below [15]:



1. Consider a time series u(i), i = 1, ..., N_max, over a compact
   support, and determine its profile (integrated series), i.e.,

   $$Y(i) = \sum_{k=1}^{i} \left[ u(k) - \langle u \rangle \right], \tag{3}$$

   where ⟨u⟩ is the mean taken over the original series u(i);

2. Divide the profile into N_s disjoint segments of equal size s and,
   in each segment ν, calculate the local trend through a polynomial
   fit of order m, represented by the variable y_ν(i). Since the
   length N_max of the series is often not a multiple of the
   considered time scale s, a short part at the end of the profile may
   remain. In order not to disregard this part of the series, the same
   procedure is repeated starting from the opposite end. Thereby, 2N_s
   segments are obtained altogether;

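3. Determine the fluctuation variance,

   $$F^2(\nu, s) = \frac{1}{s} \sum_{i=1}^{s} \left\{ Y[(\nu - 1)s + i] - y_\nu(i) \right\}^2, \tag{4}$$

   with ν = 1, ..., N_s, associated with each segment. Notice that in
   this step, polynomial trends of order m are eliminated from the
   profile.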

4. Average over all segments to obtain the fluctuation function of
   order q:

   $$F_q(s) = \left\{ \frac{1}{2N_s} \sum_{\nu=1}^{2N_s} \left[ F^2(\nu, s) \right]^{q/2} \right\}^{1/q}, \tag{5}$$

   where, in general, the variable q assumes any real value except
   zero.
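
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results reported here: the function names are hypothetical, the fluctuation variance of a segment is taken as the mean squared residual of the polynomial fit (the usual choice in the formulation of Ref. [15]), the polynomial fit is done with a small hand-written least-squares solver so that the block needs only the standard library, and the default detrending order is 1 rather than the order m = 4 used later in the text.

```python
def profile(u):
    """Step 1: cumulative sum of the deviations from the mean."""
    mean = sum(u) / len(u)
    total, out = 0.0, []
    for x in u:
        total += x - mean
        out.append(total)
    return out


def detrended_variance(segment, order):
    """Steps 2-3: least-squares polynomial fit of the given order to one
    segment; returns the mean squared residual of that fit."""
    n, k = len(segment), order + 1
    # Abscissae rescaled to [-1, 1] to keep the normal equations well conditioned.
    xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
    a = [[sum(x ** (r + c) for x in xs) for c in range(k)] for r in range(k)]
    b = [sum(x ** r * y for x, y in zip(xs, segment)) for r in range(k)]
    # Solve a * coef = b by Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        acc = sum(a[r][c] * coef[c] for c in range(r + 1, k))
        coef[r] = (b[r] - acc) / a[r][r]
    trend = [sum(cf * x ** p for p, cf in enumerate(coef)) for x in xs]
    return sum((y - t) ** 2 for y, t in zip(segment, trend)) / n


def fluctuation_function(u, s, q, order=1):
    """Step 4: q-th order fluctuation function F_q(s) for q != 0.
    Note: a segment with zero variance makes negative q undefined."""
    if q == 0:
        raise ValueError("q = 0 is excluded in this sketch")
    y = profile(u)
    n_s = len(y) // s
    offset = len(y) - n_s * s  # leftover points when len(y) is not a multiple of s
    starts = [v * s for v in range(n_s)]            # segments from the beginning
    starts += [offset + v * s for v in range(n_s)]  # segments from the opposite end
    variances = [detrended_variance(y[i:i + s], order) for i in starts]
    mean = sum(var ** (q / 2.0) for var in variances) / len(variances)
    return mean ** (1.0 / q)


# Small deterministic example: an alternating series, scale s = 8, q = 2.
print(fluctuation_function([1.0, -1.0] * 32, 8, 2))
```

Calling `fluctuation_function` over a range of scales s and moments q gives the family of curves whose slopes on a log-log plot are analyzed below. For long series a vectorized polynomial fit would be much faster; the plain-Python solver is used here only to keep the sketch self-contained.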

The characteristic property of the function F_q(s) is its scaling
behavior: if the time series u(i) possesses long-range correlations,
then F_q(s) grows with increasing s, following a power law of the
type

$$F_q(s) \sim s^{h(q)}. \tag{6}$$
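
In practice the exponent of this power law is read off as the slope of a straight line fitted to log F_q(s) against log s. The following Python sketch shows that fit with ordinary least squares; the function name is hypothetical and the input is a synthetic exact power law, used only to check that the slope is recovered.

```python
import math


def scaling_exponent(scales, fluctuations):
    """Slope of log F_q(s) versus log s from an ordinary least-squares line."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(f) for f in fluctuations]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic check: F(s) = 3 * s**0.7 is an exact power law with exponent 0.7.
scales = [20, 30, 40, 60, 80, 100]
print(scaling_exponent(scales, [3.0 * s ** 0.7 for s in scales]))
```

Repeating the fit for each value of q yields one exponent per moment, that is, an estimate of the function h(q).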



Therefore, the main result obtained with the MF-DFA method is a family
of exponents h(q), called the generalized Hurst exponents. For a
genuine multifractal series these exponents form a decreasing function
of q; if, on the other hand, the signal is monofractal, h(q) is
constant. Moreover, for q < 0, h(q) captures the properties of small
fluctuations, whereas for q > 0 large fluctuations are dominant. In
particular, when q = 2, h(2) = H is the classical Hurst exponent.

Finally, the multifractal spectrum can be obtained through a simple
relation between the exponent h(q) and the multifractal scaling
exponent τ(q), defined by the multifractal formalism [15]:

$$\tau(q) = q\,h(q) - 1. \tag{7}$$

The function τ(q) is one of the most widely used representations of
the multifractal spectra of time series.
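
A short Python sketch makes the practical use of this relation concrete: for a monofractal signal h(q) is constant, so τ(q) is a straight line in q, and any curvature signals multifractality. The function names and the h(q) values below are illustrative assumptions, not results from this study.

```python
def tau_spectrum(qs, hs):
    """Multifractal scaling exponents tau(q) = q * h(q) - 1."""
    return [q * h - 1.0 for q, h in zip(qs, hs)]


def max_deviation_from_line(qs, taus):
    """Largest distance of tau(q) from the straight line joining its end
    points; zero (up to rounding) when tau is linear in q."""
    slope = (taus[-1] - taus[0]) / (qs[-1] - qs[0])
    return max(abs(t - (taus[0] + slope * (q - qs[0]))) for q, t in zip(qs, taus))


qs = [-2.0, -1.0, 1.0, 2.0]
mono = tau_spectrum(qs, [0.5, 0.5, 0.5, 0.5])   # constant h(q): monofractal
multi = tau_spectrum(qs, [0.9, 0.8, 0.6, 0.5])  # decreasing h(q): multifractal
print(max_deviation_from_line(qs, mono), max_deviation_from_line(qs, multi))
```

The end-point line is only one simple way of quantifying curvature; with noisy estimates of h(q) a fitted line and its residuals would be a more robust choice.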

In what follows, typical results obtained using the MF-DFA technique
are presented for the different time series of the potential energy of
polyalanines described in Section II. Figure 2 shows the behavior of
the logarithm of the fluctuation function, log F_q(s), as a function
of log s and the parameter q, for the series with N = 18 residues
shown in Figure 1. The scale values were chosen in the range
20 < s < 100 and the trends were approximated by a polynomial of order
m = 4. As can be observed, the linear fits to the data agree
satisfactorily with the scaling behavior given by Equation (6).



FIG. 2: Logarithm of the fluctuation function F_q(s) against log s for
−5 < q < 5 (step 1, from top to bottom) for the polyalanine energy
time series with N = 18 residues and thermal equilibrium temperature
T = 300 K.

Figure 3(a) presents the corresponding exponents h(q) (values of the
slopes of the straight lines fitted to the data) as a function of q,
while Figure 3(b) presents the associated multifractal spectrum τ(q).
In general terms, the results indicate that the time series
investigated exhibit typical multifractal behavior (τ(q) is not a
linear function of q), which depends on the number of residues N and
the temperature T of the system.

In Figure 3, different correlation regimes may be observed: for
N = 17, the series is completely correlated, while for N = 10, 15 and
18 there is a mixed regime, i.e. strong anti-correlation when q > 0
and correlation for some values of q < 0. In particular, when N = 13
the series is totally anti-correlated. According to Ref. [27], N = 13
is the critical number of residues associated with the formation of
the α-helix at T = 300 K.

FIG. 3: (a) Dependence of the generalized Hurst exponents h(q) on the
parameter q and (b) dependence of the multifractal spectrum τ(q) on q
for polyalanines with different numbers of residues at a thermal
equilibrium temperature of T = 300 K.

Since the potential energy is a global minimum for this value of N, we
may consider that the nucleation of secondary structures drives the
dynamics of the system into an anti-correlated regime, thus overcoming
the growth trend of the energy induced by thermal agitation and by the
increase in the number of residues. In addition, as seen in Figure
3(b), all spectra τ(q) exhibit typical multifractal properties.



IV. CONCLUSIONS



In this work, we have studied the multifractal properties of time
series of the potential energy of polyalanines. Protein chains with
different numbers of residues were analyzed at three equilibrium
temperatures. The research was conducted using an approach that
combines molecular dynamics with MF-DFA, a technique of statistical
analysis, which enabled us to characterize the rugosity associated
with the temporal correlation among the dynamic variables of the
series.

Our results corroborate some of those obtained by Hegger et al. [19]
and Lidar and collaborators [22], such as the influence of time and of
the presence of the α-helix on the rugosity of the time series.
However, they also indicate that their other findings are not
confirmed, since the simulation time they used is much shorter than
that used in this study and therefore insufficient to observe the
formation of secondary structures. Also, the fractal analysis
technique employed by these authors, which does not deal adequately
with the existence of trends in the series, did not allow the subtler
details of the spectra to be captured.

Indeed, the results obtained in this study indicate that all the
series examined exhibit typical multifractal behavior, which depends
both on the number of residues N and on the temperature T of the
system, and that these multifractal properties, represented by the
τ(q) spectra or, equivalently, the generalized Hurst exponents h(q),
reveal important aspects of the temporal evolution of the system.

It was found, for example, that when the number of residues approaches
the critical number of residues N_c, associated with the formation of
a significant amount of secondary structure, the temporal correlation
regime of the system changes. In the case N_c = 13 and T = 300 K, the
system is totally anti-correlated, the spectrum τ(q) is truly
multifractal and the rugosity is more pronounced in the region of
small fluctuations (q < 0), as seen in Figure 3. For other values of
N, the results confirm that the two regimes of correlation are present
in the series.

Recently, Moret and collaborators [30] conducted an analysis of the
spectra (profiles) of the potential energy of proteins as a function
of the number of dihedral angles φ and ψ, and found that these
profiles are described by truly multifractal f(α) spectra. They also
found that the f(α) spectra were sensitive to the number of degrees of
freedom of the system, thus illustrating that the dimensionality of
the phase space influences the accessibility of parts of the
hyper-surface of the potential energy, since the proteins adopt
conformations in the phase space only in the permitted regions of the
spectrum f(α).

This behavior allows an alternative explanation for the folding
dynamics of a protein, because it suggests the existence of
preferential folding trajectories along the energy hyper-surface, i.e.
in the search for its native state, a protein need not visit all the
accessible states in the phase space, but only those associated with
the spectrum f(α).

The MF-DFA method, applied to the time series of the potential energy
of polyalanines, has enabled this study to reveal important aspects of
the richness and complexity associated with the temporal evolution of
these systems in the search for their native states.

In fact, it was shown that, depending on the number of residues and
the temperature, the trajectory along which the protein dynamically
visits its phase space is guided mainly by the influence of secondary
structures, which are formed over the simulation time and probe the
hyper-surface of the conformational energy at different time scales.
As a result, the energy time series exhibit multifractal long-range
correlations.

Therefore, our results support an alternative explanation of the
so-called Levinthal paradox [31], because in this scenario the
protein, in its dynamic evolution, is influenced by the emergence of
intermediate structures which gradually, through successive increases
in conformational stability, channel the trajectories along
preferential folding pathways. Consequently, the extreme ease with
which a protein folds, despite the huge number of possible
configurations, may be attributed to a succession of events that it
experiences on a multifractal space-time energy hyper-surface.